PERFORMANCE ASSESSMENT IN EDUCATION



Introduction 

Performance assessment is certainly not new to education. Teachers have always been observers of
student behavior (Stiggins & Bridgeford, 1985). Oral examinations have been used to determine student
progress and learning for hundreds if not thousands of years. And beyond these practical applications,
performance assessment has been the focus of scholarly activity for decades. Each of the past four
decades has produced at least one major update of research and development in performance assessment
(Ryans & Frederiksen, 1951; Glaser & Klaus, 1962; Fitzpatrick & Morrison, 1977; and Berk, 1986).

What is new in the 1990s is the "holy Grail" emphasis currently being attached to performance assessment in
education. The combination of increasing demands for accountability and the desire to measure a wide
variety of more complex educational outcomes makes the use of performance assessment in all its guises
(portfolios, assessment centers, systematic classroom observation, structured tasks) an essential addition to
the array of tools used to profile student achievement.

Because of the headlong rush to use performance assessment methodology, we have gained a great deal of
experience over the past few years with both its appropriate and inappropriate use. For this reason, this is an
excellent time to think about educational performance assessment in all its forms and to ponder what we've
learned so far.

Overview of Performance Assessment In Education 

Performance assessments are used in many ways in education because of the multitude of purposes for
these assessments, achievement targets to be addressed, and populations to assess, including teachers,
students, and administrators. Table 1 presents a summary of such applications. We will briefly explain
these applications and illustrate each with a few examples.



Insert Table 1 about here 



Assessment of Student Performance 

Purposes. Assessments of student performance serve a variety of purposes. As classroom tools, they can 
inform specific instructional decisions made by teachers, students and parents, and serve as teaching and
learning tools for both teachers and students. Teachers use performance assessment methodology to
engage students in the assessment of their own and each other's performance as a means of becoming
more accomplished performers.

In fact, one of the most important developments in the classroom use of this methodology over the past
decade has been the realization that the entire process of performance assessment can be a powerful
instructional tool. For example, consider the six-trait analytical scoring method used for student writing in
grades 3-12 in Oregon. This procedure was originally developed in Beaverton School District in 1984 by
teachers seeking to improve upon holistic scoring as a means of providing feedback to student writers. The
six traits of ideas, organization, voice, word choice, sentence fluency, and conventions are used to analyze
all types of student writing. Because the six traits describe good writing, many teachers are helping students
to look for these characteristics in their own writing and that of others. This provides students (and teachers)
a common vocabulary for communicating about and developing sound writing habits. (For more information
on integrating analytical scoring into instruction see Spandel & Stiggins, 1989, and Spandel, 1992.)
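
The difference between analytical and holistic scoring can be sketched in a few lines of Python: an analytical record keeps one rating per trait, so feedback can name the traits that need work rather than report a single overall number. The 1-5 scale, the function name `feedback_profile`, the cutoff value, and the sample ratings are all assumptions made for illustration; none of them is a detail of the Oregon procedure.

```python
# Hypothetical sketch of an analytic (per-trait) score record.
# The 1-5 scale and the cutoff below are assumptions for illustration only.

SIX_TRAITS = (
    "ideas",
    "organization",
    "voice",
    "word choice",
    "sentence fluency",
    "conventions",
)


def feedback_profile(scores, cutoff=3):
    """Split trait scores into strengths and traits needing attention.

    scores: dict mapping each of the six traits to an integer rating.
    cutoff: ratings at or above this value count as strengths (assumed).
    """
    missing = [t for t in SIX_TRAITS if t not in scores]
    if missing:
        raise ValueError(f"missing trait ratings: {missing}")
    strengths = [t for t in SIX_TRAITS if scores[t] >= cutoff]
    needs_work = [t for t in SIX_TRAITS if scores[t] < cutoff]
    return {"strengths": strengths, "needs_work": needs_work}


# Invented ratings for one student paper.
paper = {
    "ideas": 4,
    "organization": 2,
    "voice": 5,
    "word choice": 3,
    "sentence fluency": 3,
    "conventions": 2,
}
profile = feedback_profile(paper)
print(profile["needs_work"])
```

A holistic score would collapse these six ratings into one value; keeping them separate is what gives students and teachers trait-level vocabulary for discussing a paper.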

Educators also use performance assessment of students for accountability purposes (i.e., to communicate
to communities about student achievement), to inform building or district decision making (program
evaluation, student certification), and for selection and placement decisions, either into advanced or remedial
programs or into college.

Additionally, we have learned that performance assessment methodology holds the potential of helping us
communicate to students and others what we value (Wiggins, 1988). If we value problem solving,
cooperative learning, integrating writing and math across the curriculum, and critical thinking, our
assessment must reflect this fact. Examples of attempts to use large-scale assessment to communicate
valued outcomes are Connecticut's science performance assessments, which require students to cooperate
(Baron, 1990), and Vermont's inclusion of student dispositions (i.e., attitudes about the subject) in its
scoring criteria for mathematics portfolios (Vermont Department of Education, 1991).

Targets. The kinds of student achievement targets being translated into performance assessments are
those that require observation and considered professional judgment as the basis for evaluation. If we think
of the various kinds of valued outcomes in education as being classified as (1) the mastery of subject matter
knowledge, (2) the use of that knowledge to solve problems, (3) the exhibition of certain kinds of valued
behaviors, (4) the creation of products that possess certain attributes, and (5) the acquisition of certain
affective responses, then performance assessments are primarily being used in the contexts of categories
two, three and four.

For example, Oregon has developed and is pilot testing an analytical trait scoring scheme to evaluate
mathematical problem solving for use in grades 3 and 8 as part of its state assessment. The four traits are
conceptual understanding of the problem, procedural knowledge, problem solving skills and strategies, and
communication. The target is problem solving skill in mathematics and the ability to successfully
communicate one's thinking.

Another example of the broad range of targets for student performance assessment is interactive speaking
and listening: how well, for example, can students interact verbally with others in group discussions, social
interactions, interviews, and instruction? The English Language Skills Profile (Hutchinson & Pollitt, 1987) has
one exercise that involves a group discussion and another called a "paired interview." In the paired interview,
pupils are given written information about a proposed community project involving young people, and asked
to discuss, in pairs, various aspects of its implementation with a view to decision making. There is a third
person available to provide additional information upon request of the students. Students are assessed on
their ability to interact appropriately with each other and the third person, appropriateness of comments,
clarity of expression, willingness to cooperate, and the degree of support needed to complete the task.

Methods. The methods being used for student performance assessments vary in terms of the tasks that
students perform, and in the criteria used to judge performance on the tasks.

Tasks include simulations, structured performance assessment tasks, portfolios and classroom exercises.
The most common designs are direct observation of ongoing classroom events and the development and
administration of structured performance assessment tasks. The former type of assessment tends to be
informal, using checklists and rating forms developed without pilot testing. For example, the British Columbia
Ministry of Education produced a document to assist classroom teachers with evaluation and instruction of
oral communication (Jeroski et al., 1988). The handbooks contain a large number of checklists, observation
forms, peer reviews, and self-reflection instruments for informal use in the classroom.

More structured performance assessment tasks are being designed in a number of contexts (see, for
example, Baron, 1990; National Assessment of Educational Progress, 1987; California State Department of
Education, 1989; Kanis, 1991; Larter, 1991; and Whetton, 1991). An example of a computer simulation is
presented by Shavelson et al. (in press), in which the authors compare assessment of hands-on science
laboratory tasks to several surrogates, such as lab notebooks, computer simulations and standardized test
scores.

In addition to sound performance tasks, another key to effective student performance assessment is the
careful development and application of proper criteria to use in the judgment process. Some assessments
have criteria tied directly to the task, so that different criteria are developed for each task (e.g., California
Department of Education, 1989, and Larter, 1991). Others, such as those involved in many writing
assessments, find it more productive to develop broader criteria that can be applied across various tasks.
For example, instead of looking for various features in a response (such as the presence of a graph), one
would look for the ability of the student to employ appropriate solution methods, be flexible in the methods
used, and switch methods when needed. The latter approach is more difficult to use because it requires that
teachers and other users of the assessment completely understand what flexibility or efficiency really means
and looks like across tasks.

One combination of tasks and criteria being given a great deal of attention in education these days is the use
of the achievement portfolio. This application calls for the accumulation of examples of student work over
time in a context where clear criteria have been established for the selection of entries into the portfolio,
criteria have been developed for evaluating the work collected, and students play a key role in the
assessment process by reflecting in a systematic manner on changes in their achievement as depicted by
the work collected. One excellent example of this kind of assessment can be found in the work of Juneau
Borough School District (Calkins, 1991). Each student portfolio includes several samples of student writing
and reading collected at various times in the school year. Student progress is systematically rated using
developmental continua. Students have input in deciding what will be placed into their portfolios, and have the
opportunity to explain why the pieces were selected for the portfolio and how they feel about
themselves as readers and writers.

More information about performance assessment alternatives currently in use across the country is available
from the Test Center at Northwest Regional Educational Laboratory^2, and in Arter (1989) and Arter &
Spandel (1992).

Assessment of Teacher Performance 

Purposes. The major reasons for conducting performance assessments of teachers are admission into
teacher training programs, certification and licensure to teach, promoting professional development,
accountability, and assuring minimum competence. For example, the California New Teacher Project has
been exploring the use of performance assessments for teacher certification for several years. One set of
prototypes (Murray et al., 1990) involved four simulations in which prospective teachers watched videotapes
of typical language arts classroom situations and then answered a series of open-ended questions to assess
their pedagogical knowledge. Performance criteria were developed to match each task. Stiggins and Duke
(1989) draw a stark contrast between the uses of teacher performance assessment for professional
development and accountability, depicting the key elements of assessments used for the former.

Targets. The aspects of teacher performance assessed include classroom management skills, instructional
skills, and communication skills, among others. A wide variety of observational instruments and schemes for
analyzing teachers' classroom behavior and products is collected in Good and Brophy (1987).

Methods. The performance assessment methods being used for teachers include classroom observation,
portfolios, assessment centers, and simulations. For example, the Teacher Assessment Project at Stanford
University tried portfolios and assessment center techniques (Teacher Assessment Project, 1988, 1989a,
1989b, 1989) to assess teacher subject area knowledge, pedagogical knowledge, and attitudes in biology
and elementary literacy. For the literacy portfolio, teachers are asked to select four items that relate to
integrated language instruction, three that relate to creating a literate environment, and four about
assessment of students. Teachers may also present an open entry and a reflective interpretation of any and
all entries. The related assessment center experience includes six exercises, some of which draw on the
teacher's portfolio. Other exercises simulate teaching situations.

Assessment of Principal Performance 

Purposes. The principal reasons for conducting performance assessments of principals are hiring,
professional development, and accountability (formal performance reviews).

The foremost examples of performance assessment of principals for placement and professional development
are the Assessment Centers operated under the auspices of the National Association of Secondary School
Principals^3. Over a three-day period at the center, principals are involved in six to eight exercises: leadership
group exercises, in-basket exercises, fact-finding exercises, and structured interviews. Performance is
observed by trained assessors who look for specific behaviors that are translated into scores in 12 areas:
problem analysis, judgment, organizational ability, decisiveness, leadership, sensitivity, stress tolerance, oral
communication, written communication, range of interest, personal motivation, and educational values.

Targets. The targets of performance assessment of principals are a variety of behaviors, styles, and skills.
Management skills include such things as managing the budget, assessing student progress, and making the
school run smoothly. Leadership involves vision setting, inspiring others to act, modeling the way, and
inspiring the heart (Kouzes & Posner, 1968). Personality traits involve such things as tolerance of ambiguity,
sensitivity, and motivation. Instruments assessing styles have focused on such things as participatory
leadership or directive leadership. Other knowledge and skills include ability to communicate orally and in
writing and ability to solve problems. The twelve areas rated in the Assessment Center example cited above
cover many of these dimensions.

Methods. Assessment tactics used for principal assessments have included structured interviews and
on-the-job observation, as well as the assessment center and in-basket tasks described above. For example,
the Situational Leadership Instrument Package (Hersey et al., 1982) includes an observational checklist
(called the Interaction Influence Analysis) in which an observer keeps track of nine behaviors during an
interaction between a leader and a subordinate.

Reviews of additional instruments for assessing the leadership and management qualities of school 
administrators (most of which are questionnaires and surveys) are available in Arter (1990).

Lessons Learned to Date 

The Need For Clear Targets

The recent surge in interest in and development of performance assessments in education has brought
benefits with it. For example, the obvious need to base subjective evaluations on sound criteria applied by
carefully trained raters has necessitated the articulation of clear visions of the meaning of good performance.
The 1980s was the decade for reexamining the valued outcomes of education. As that work has proceeded
around the development of sound performance assessment criteria, we have acquired far clearer
understandings of what it means, for example, to be an effective writer, reader, speaker, etc. This has
tremendous potential for improving instruction as well as being essential for good performance assessment.

The Need For An Array of Assessment Tools 

These sharper images of success have brought with them the realization that most of these valued outcomes 
are in fact far more complex than we had previously realized. This, in turn, has given new momentum to the
drive for richer, more complete assessments of student achievement. Traditional paper and pencil tests,
while still valuable tools, will never again be regarded as sufficient as a means of profiling student
achievement. Rather, we now know that we must rely on a broad array of assessment tools to depict a
broad array of valued outcomes.



The Need For Training 

Our drive toward more diverse assessments has sensitized us to the need for new levels of assessment
competence on the part of all educators and assessors. Sound performance assessments can only be
developed and conducted by those who (a) possess a clear, highly differentiated vision of the valued
outcome, and (b) have mastered the craft knowledge needed to transform that target into appropriate
performance exercises and performance criteria. Unfortunately, we have discovered that many charged with
assessing student competence are not, in fact, qualified to do so.

Cost/Time/Technical Issues

Experimental application of performance assessment methodology in large-scale assessment contexts has
revealed the great cost of this labor-intensive assessment alternative. These costs become prohibitive when
considered in light of the lessons we are learning about the psychometric quality issues that must be
addressed with performance assessments (Arter, 1991; Rothman, 1990; Valencia, 1989; Frechtling, 1991;
Linn et al., 1992). To meet accepted standards of validity (generalizability) and reliability (internal
consistency), assessments often must include a variety of samples of student performance. If each sample
carries with it high scoring costs, then the overall costs of an assessment that is sufficient in its breadth of
exercises can be very high. Further, to meet accepted standards of reliability in the sense of objectivity or
interrater agreement, very thorough rater training is essential and multiple judges are required to control for
measurement error due to rater. All of this adds cost. In times of rapidly declining resources for education in
general, rising assessment costs are a problem.
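
As a rough illustration of what checking interrater agreement involves, the Python sketch below compares two raters' scores on the same set of papers. Exact and adjacent agreement rates are one common way to summarize rater consistency; they are used here as an assumption, since the text does not name a particular statistic. The function name `agreement_rates` and the ratings are invented for the example.

```python
# Hypothetical sketch: exact and adjacent agreement between two raters.
# The ratings are invented; the source reports no agreement data.


def agreement_rates(rater_a, rater_b):
    """Return (exact, adjacent) agreement proportions for paired ratings.

    exact: share of papers given identical scores by both raters.
    adjacent: share of papers whose two scores differ by at most one point.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty rating lists of equal length")
    pairs = list(zip(rater_a, rater_b))
    exact = sum(a == b for a, b in pairs) / len(pairs)
    adjacent = sum(abs(a - b) <= 1 for a, b in pairs) / len(pairs)
    return exact, adjacent


# Scores from two raters on the same eight papers (invented data).
rater_a = [4, 3, 5, 2, 3, 4, 1, 3]
rater_b = [4, 2, 5, 4, 3, 3, 1, 3]
exact, adjacent = agreement_rates(rater_a, rater_b)
print(f"exact agreement: {exact:.3f}, adjacent agreement: {adjacent:.3f}")
```

Every paper in such a check is scored twice, which is one reason the scoring cost described above grows quickly once multiple judges are required.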

Issues Associated With High Stakes Testing

Many large-scale assessments are also high-stakes assessments (high school graduation, admission to
college, report cards on schools, etc.). These uses lead to their own problems, and indeed they are the same
problems encountered previously with high-stakes testing: restricting curriculum, teaching the test (not just to
the performance criteria), negative effects on students and teachers, the proliferation of a test-preparation
industry that may or may not really "work", and results that are, therefore, not valid. Just moving from
multiple-choice tests to performance assessments will not solve these problems.

Performance Assessment As An Instructional Tool

This leaves educators on the horns of a dilemma. Many of the outcomes to be assessed are too complex to 
permit reliance on traditional paper and pencil objective tests. So we cannot return to yesteryear and rely
solely on those. We must move forward and embrace performance assessment alternatives to create a
complete profile of student achievement. But we cannot do that either, because the cost of such
assessments is so astronomically high.

One possible solution to this dilemma might be to turn to teachers as the providers of the more complex
student achievement information we desire. After all, they have the opportunity to gather the information
needed over time, sampling with diverse exercises and providing the replications needed to produce valid
assessment results. There are at least two problems with this plan. First, decades of neglect of teacher
training in assessment have left teachers with neither the clear vision of achievement targets nor the
performance assessment design expertise needed to play this critical role in the future of educational
assessment.

Second, many performance assessments are designed in ways that tend to limit their usefulness
instructionally, resulting in little incentive for teachers to want to put in the effort to gather high quality
performance information.

Some features of current performance assessments which tend to limit instructional usefulness are
performance criteria that are tied directly to individual tasks (so that the criteria change for each exercise);
holistic scoring; procedures that do not involve teachers in the scoring of performances; high stakes uses;
and activities that do not make the students an interested partner. With respect to the latter point, tying
important decisions to the results of a performance assessment does not make the student an interested
partner: an intimidated partner perhaps, but not one who is interested in an honest outside or self-appraisal
of his or her status and progress.

An example of an instructionally useful performance assessment is the six-trait analytical procedure for
writing used in Oregon. The same six traits describe good writing in general. Thus, the criteria are broad
and not tied to particular exercises. This allows a consistent vocabulary for discussing writing across
teachers and tasks. Students are made partners in the process by involving them in analyzing their own
work and that of others using the criteria. Many teachers integrate the six traits with the writing process
during peer review and revision as a consistent and powerful way to provide feedback using a common
vocabulary. Others structure instruction around the traits so that, for example, students will spend some time
thinking about and analyzing how organization can affect what an author is trying to say.

Teachers are made partners in this process by showing them how to use the model in instruction, and by
involving as many teachers as possible in scoring statewide assessments. This procedure not only trains
teachers in using the model, but also allows them to systematically apply it to large numbers of student
papers, and to get a good idea of what student writing is really like at the various grade levels.

Articulating and applying performance criteria help teachers to know what "good" looks like and how students
develop toward our goals for them. As Murphy and Smith (1990) state: "The benefits of portfolios lie as
much in the discussions they generate among teachers--and among teachers and students--as in the wealth
of information they provide." This is equally true of all good performance assessment because it forces us to
articulate what we value in a performance and to apply it consistently to student work. Teachers and
students learn in the process.

We would like to suggest that to have performance assessments that mean anything, we need to first ensure
that teachers perceive them as good instructional tools and know how to use them as instructional tools.
This will require a great deal of training. Thus, as we move through the early 1990s, we face a major
unresolved performance assessment problem: We need teachers and want to take advantage of all they 
offer, but they simply are not equipped to do the job. Further, we appear not to have the resources with 
which to solve this immense national problem. 
FOOTNOTES



1. For more information on this system, contact Michael Dalton, Oregon Department of Education, 700
Pringle Parkway S.E., Salem, OR 97310.

2. The Test Center at Northwest has been collecting alternative assessment devices for several
years. Annotated bibliographies of such instruments are available in the areas of reading, math,
science and portfolios. Contact Judy Arter, Northwest Regional Educational Laboratory, 101 S.W.
Main, Suite 500, Portland, OR 97204.

3. For more information about the NASSP Assessment Centers, write to the National Association of
Secondary School Principals, 1904 Association Drive, Reston, VA 22091.
Table 1

Applications of Performance Assessment in Education

          Students                        Teachers                    Principals

Purposes  Teacher instructional           Admission to training       Hiring
            decision making               Certification               Professional development
          Tool for instruction            Professional development    Accountability
          Student decision making         Accountability              Admission to training
          Accountability
          Program evaluation
          Student certification,
            graduation, promotion
          College admission
          Communicate what is valued

Targets   Subject matter knowledge        Classroom management        Management skills
          Thinking processes              Instructional skills        Leadership behavior
          Products, e.g., research        Subject matter knowledge    Personality traits/styles
            reports                       Pedagogical knowledge       Problem solving
          Achievement related behavior,   Communication skills        Communication skills
            e.g., communication
          Affect, e.g., persistence,
            flexibility, self-confidence

Methods   Classroom observation           Classroom observation       On-the-job observation
          Portfolios                      Assessment centers          Assessment centers
          Structured performance          Simulations                 In-basket
            assessment tasks                                          Structured interviews
          Simulations